BayOTIDE: Bayesian Online Multivariate Time series Imputation with functional decomposition
In real-world scenarios like traffic and energy, massive time-series data
with missing values and noise are widely observed and are often sampled irregularly.
While many imputation methods have been proposed, most of them work with a
local horizon, which means models are trained by splitting the long sequence
into batches of fit-sized patches. This local horizon can make models ignore
global trends or periodic patterns. More importantly, almost all methods assume
the observations are sampled at regular time stamps, and fail to handle complex
irregularly sampled time series arising from different applications. Finally,
most existing methods are learned in an offline manner and are therefore not
suitable for many applications with fast-arriving streaming data. To overcome
these limitations, we propose BayOTIDE: Bayesian Online Multivariate Time series
Imputation with functional decomposition. We treat the multivariate time series
as the weighted combination of groups of low-rank temporal factors with
different patterns. We apply a group of Gaussian Processes (GPs) with different
kernels as functional priors to fit the factors. For computational efficiency,
we further convert the GPs into a state-space prior by constructing an
equivalent stochastic differential equation (SDE), and develop a scalable
algorithm for online inference. The proposed method can not only handle
imputation over arbitrary time stamps, but also offer uncertainty
quantification and interpretability for the downstream application. We evaluate
our method on both synthetic and real-world datasets.
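The GP-to-state-space conversion mentioned in this abstract can be illustrated with the simplest textbook case. The sketch below is not the authors' algorithm: it assumes a single scalar factor with an exponential (Matérn-1/2) kernel, whose equivalent SDE is the Ornstein-Uhlenbeck process, and runs an ordinary Kalman filter over irregular time stamps. The function name `ou_kalman_filter` and all parameter names and defaults are hypothetical.

```python
import math

def ou_kalman_filter(times, values, lengthscale=1.0, variance=1.0, noise_var=0.1):
    """Online filtering for a scalar GP with kernel variance * exp(-|dt| / lengthscale).

    values[i] is None where the observation at times[i] is missing.
    Returns a list of (posterior mean, posterior variance), one per time stamp.
    """
    mean, var = 0.0, variance          # stationary prior of the OU process
    prev_t = times[0]
    out = []
    for t, y in zip(times, values):
        # Predict: exact discretisation of the OU SDE over the (irregular) gap.
        a = math.exp(-(t - prev_t) / lengthscale)
        mean = a * mean
        var = a * a * var + variance * (1.0 - a * a)
        # Update: skipped for missing values, so the prediction is the imputation.
        if y is not None:
            gain = var / (var + noise_var)
            mean = mean + gain * (y - mean)
            var = (1.0 - gain) * var
        out.append((mean, var))
        prev_t = t
    return out
```

Each step costs constant time regardless of the sequence length, which is the usual motivation for the state-space form; the returned variance grows at missing time stamps, giving a simple form of uncertainty quantification.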
Dynamic Tensor Decomposition via Neural Diffusion-Reaction Processes
Tensor decomposition is an important tool for multiway data analysis. In
practice, the data is often sparse yet associated with rich temporal
information. Existing methods, however, often under-use the time information
and ignore the structural knowledge within the sparsely observed tensor
entries. To overcome these limitations and to better capture the underlying
temporal structure, we propose Dynamic EMbedIngs fOr dynamic Tensor
dEcomposition (DEMOTE). We develop a neural diffusion-reaction process to
estimate dynamic embeddings for the entities in each tensor mode. Specifically,
based on the observed tensor entries, we build a multi-partite graph to encode
the correlation between the entities. We construct a graph diffusion process to
co-evolve the embedding trajectories of the correlated entities and use a
neural network to construct a reaction process for each individual entity. In
this way, our model can capture both the commonalities and personalities during
the evolution of the embeddings for different entities. We then use a neural
network to model the entry value as a nonlinear function of the embedding
trajectories. For model estimation, we combine ODE solvers to develop a
stochastic mini-batch learning algorithm. We propose a stratified sampling
method to balance the cost of processing each mini-batch so as to improve the
overall efficiency. We show the advantage of our approach in both a simulation
study and real-world applications. The code is available at
https://github.com/wzhut/Dynamic-Tensor-Decomposition-via-Neural-Diffusion-Reaction-Processes
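As a rough illustration of the diffusion-reaction idea in this abstract, the sketch below integrates a toy system with explicit Euler steps: a graph diffusion term pulls the states of connected entities together, and a per-entity reaction term acts on each state individually. It is an assumption-laden simplification, not the DEMOTE implementation: embeddings are scalars, the reaction is an arbitrary Python callable standing in for the neural network, and the names `diffusion_reaction_step` and `integrate` are hypothetical.

```python
def diffusion_reaction_step(u, edges, dt, diffusion=1.0, reaction=None):
    """One Euler step of du_i/dt = diffusion * sum_j w_ij * (u_j - u_i) + reaction(u_i).

    u is a list of scalar embeddings; edges is a list of (i, j, weight) tuples
    describing an undirected weighted graph.
    """
    du = [0.0] * len(u)
    for i, j, w in edges:
        flow = diffusion * w * (u[j] - u[i])
        du[i] += flow      # entity i moves toward entity j
        du[j] -= flow      # and entity j moves toward entity i
    if reaction is not None:
        for i, ui in enumerate(u):
            du[i] += reaction(ui)   # individual (per-entity) dynamics
    return [ui + dt * dui for ui, dui in zip(u, du)]

def integrate(u0, edges, t_end, dt=0.01, **kwargs):
    """Integrate the toy system from time 0 to t_end with a fixed step size."""
    u = list(u0)
    for _ in range(int(round(t_end / dt))):
        u = diffusion_reaction_step(u, edges, dt, **kwargs)
    return u
```

With no reaction term, the diffusion alone conserves the sum of the states and shrinks their spread, which corresponds to the shared ("commonality") part of the dynamics; the reaction callable supplies the entity-specific part.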
Analysis of Multivariate Scoring Functions for Automatic Unbiased Learning to Rank
Leveraging biased click data for optimizing learning to rank systems has been
a popular approach in information retrieval. Because click data is often noisy
and biased, a variety of methods have been proposed to construct unbiased
learning to rank (ULTR) algorithms for the learning of unbiased ranking models.
Among them, automatic unbiased learning to rank (AutoULTR) algorithms that
jointly learn user bias models (i.e., propensity models) with unbiased rankers
have received a lot of attention due to their superior performance and low
deployment cost in practice. Despite their differences in theories and
algorithm design, existing studies on ULTR usually use uni-variate ranking
functions to score each document or result independently. On the other hand,
recent advances in context-aware learning-to-rank models have shown that
multivariate scoring functions, which read multiple documents together and
predict their ranking scores jointly, are more powerful than uni-variate
ranking functions in ranking tasks with human-annotated relevance labels.
Whether such superior performance would hold in ULTR with noisy data, however,
is mostly unknown. In this paper, we investigate existing multivariate scoring
functions and AutoULTR algorithms in theory and prove that permutation
invariance is a crucial factor that determines whether a context-aware
learning-to-rank model could be applied to the existing AutoULTR framework. Our
experiments with synthetic clicks on two large-scale benchmark datasets show
that AutoULTR models with permutation-invariant multivariate scoring functions
significantly outperform those with uni-variate scoring functions and
permutation-variant multivariate scoring functions.
Comment: 4 pages, 2 figures. Accepted for publication in the
Proceedings of the 29th ACM International Conference on Information and
Knowledge Management (CIKM '20), October 19-23, 202
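The permutation-invariance property discussed in this abstract can be checked mechanically. The sketch below assumes that the property means each document receives the same score no matter how the input list is ordered; both toy scorers and the helper `is_permutation_invariant` are hypothetical illustrations, not models from the paper.

```python
from itertools import permutations

def invariant_scorer(features):
    """Multivariate scorer whose context (the list mean) ignores input order."""
    mean = sum(features) / len(features)
    return [x - mean for x in features]

def variant_scorer(features):
    """Multivariate scorer whose context is the previous item in input order."""
    scores, prev = [], 0.0
    for x in features:
        scores.append(x + 0.5 * prev)
        prev = x
    return scores

def is_permutation_invariant(scorer, features, tol=1e-12):
    """Return True if every document keeps its score under all input orderings."""
    base = scorer(features)
    for perm in permutations(range(len(features))):
        permuted_scores = scorer([features[i] for i in perm])
        for pos, doc in enumerate(perm):
            if abs(permuted_scores[pos] - base[doc]) > tol:
                return False
    return True
```

The exhaustive check over all orderings is only practical for short lists; for longer ones a random sample of permutations would be the natural substitute.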
The adaptation of Arctic phytoplankton to low light and salinity in Kongsfjorden (Spitsbergen)
The basic environmental variables and the adaptability of phytoplankton communities to low light and salinity were studied using incubation experiments in Kongsfjorden, a high Arctic fjord of Spitsbergen, in late summer 2006. Chlorophyll a concentrations were steady or decreased slightly in darkness after one day or one week of incubation. When exposed to natural light after one week of incubation in darkness, chlorophyll a concentrations showed an initial decline and then increased significantly. In a salinity experiment, the maximal growth rate was observed at a dilution ratio of 10%; however, higher dilution ratios (≥40%) had an obvious negative effect on phytoplankton growth. We suggest that the phytoplankton communities in fjords in late summer are darkness adapted, and that the inflow of glacial melt water is favorable for phytoplankton growth in the outer fjords, where the influence of freshwater is limited.
Provably Convergent Schrödinger Bridge with Applications to Probabilistic Time Series Imputation
The Schrödinger bridge problem (SBP) is gaining increasing attention in
generative modeling and showing promising potential even in comparison with the
score-based generative models (SGMs). SBP can be interpreted as an
entropy-regularized optimal transport problem, which conducts projections onto
every other marginal alternatingly. However, in practice, only approximated
projections are accessible and their convergence is not well understood. To
fill this gap, we present a first convergence analysis of the Schrödinger
bridge algorithm based on approximated projections. As for its practical
applications, we apply SBP to probabilistic time series imputation by
generating missing values conditioned on observed data. We show that optimizing
the transport cost improves the performance and the proposed algorithm achieves
the state-of-the-art result in healthcare and environmental data while
exhibiting the advantage of exploring both temporal and feature patterns in
probabilistic time series imputation.
Comment: Accepted by ICML 202
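The view of the SBP as entropy-regularized optimal transport solved by alternating projections has a well-known static, discrete analogue: the Sinkhorn (iterative proportional fitting) iteration. The sketch below shows that analogue only, with exact projections; it is not the paper's algorithm, which concerns approximated projections in the dynamic setting. The function name `sinkhorn` and its parameters are hypothetical.

```python
import math

def sinkhorn(cost, mu, nu, eps=0.5, iters=500):
    """Entropy-regularized optimal transport between discrete marginals mu and nu.

    cost[i][j] is the transport cost, eps the regularization strength.
    Returns the coupling matrix as a list of rows.
    """
    n, m = len(mu), len(nu)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Projection onto couplings whose row marginal equals mu.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Projection onto couplings whose column marginal equals nu.
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
```

Each half-step matches one marginal exactly and slightly disturbs the other; repeating the two projections drives both marginal errors to zero.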
The adaptability of three Arctic microalgae to different low temperatures
To study the adaptability of Arctic microalgae to different environmental temperatures, the growth curves and antioxidase systems of three microalgae (Skeletonema marinoi, Chlorella sp. and Chlamydomonas sp.) isolated from Ny-Ålesund in the high Arctic were determined at different low temperatures (0°C, 4°C and 8°C). The results showed that the adaptability of the microalgae to temperature depended on the species. The growth rate and the SOD and CAT activities of Skeletonema marinoi were the highest at 4°C, while its MDA content was the lowest. The growth rate and enzyme activity of Chlorella sp. were the highest at 8°C, while its lowest MDA content occurred at 0°C. The growth of Chlamydomonas sp. did not differ significantly across the temperatures; its lowest MDA content occurred at 8°C. The changes in the antioxidase system also depended on species and temperature. The three antioxidase indexes of Skeletonema marinoi differed highly significantly between 0°C and 4°C (p<0.01). The SOD activity of Skeletonema marinoi and Chlorella sp. differed significantly between 0°C and 8°C (p<0.05), while their other two indexes did not differ significantly. The antioxidase systems of Chlamydomonas sp. did not differ significantly among the three temperatures. In conclusion, the three microalgae adapted well to the three temperatures; their MDA content remained at a low level, and each had a unique physiological mechanism for adapting to environments with different low temperatures.
Genome-wide identification and expression analysis of 3-ketoacyl-CoA synthase gene family in rice (Oryza sativa L.) under cadmium stress
3-Ketoacyl-CoA synthase (KCS) is the key rate-limiting enzyme for the synthesis of very long-chain fatty acids (VLCFAs) in plants and determines the carbon chain length of VLCFAs. However, a comprehensive study of KCSs in Oryza sativa has not yet been reported. In this study, we identified 22 OsKCS genes in rice, which are unevenly distributed on nine chromosomes. The OsKCS gene family is divided into six subclasses. Many cis-acting elements related to plant growth, light, hormone, and stress response were enriched in the promoters of OsKCS genes. Gene duplication played a crucial role in the expansion of the OsKCS gene family, which underwent strong purifying selection. Quantitative real-time polymerase chain reaction (qRT-PCR) results revealed that most KCS genes are constitutively expressed. We also found that KCS genes responded differently to exogenous cadmium stress in japonica and indica backgrounds, and that the KCS genes with higher expression in leaves and seeds may have functions under cadmium stress. This study provides a basis for further understanding the functions of KCS genes and the biosynthesis of VLCFAs in rice.
Investigation and Analysis of Guangzhou Nansha Coast Park Point Source Pollution and Non-point Source Pollution
[Objective] To determine the status of point and non-point source pollution in Nansha Coast Park. [Method] The park's water environment was investigated to analyze the contribution rates of point and non-point source pollutants; water quality monitoring sites were set up to collect basic data on the relevant indicators, and national water quality standards were then used to evaluate water quality. [Result] Point source pollution in the Coast Park mainly comes from fertilizer spraying for campus greening. The COD and mean ammonia concentration of the lakes and of the river outside the park belong to grade III. The total nitrogen of the lake belongs to grade III. The total phosphorus belongs to grade IV. The total nitrogen of the river is the worst. The total phosphorus is grade V. [Conclusion] The lake water quality is highly affected by point and non-point source pollution; the water quality of the river is worse than that of the lake in the park, and strong remediation measures are needed.